Technique For Clustering Uncertain Data Based On Probability Distribution Similarity
Abstract
Clustering uncertain data is one of the essential tasks in data mining. Traditional algorithms for clustering uncertain data, such as K-Means, UK-Means, and density-based clustering, are limited to geometric distance-based similarity measures and cannot capture differences between the distributions of uncertain objects. Such methods cannot handle uncertain objects that are geometrically indistinguishable, such as products with the same mean but very different variances in customer ratings. K-medoid clustering based on KL divergence similarity instead clusters uncertain data according to the similarity of their probability distributions. Several methods have been proposed for clustering uncertain data, and some of them are reviewed here. Compared to the traditional clustering methods, the K-Medoid clustering algorithm based on KL divergence similarity is more efficient. First, each uncertain data object is modeled by a probability distribution; next, the similarity between data objects is measured using a distance metric; finally, a clustering method such as partitioning clustering or density-based clustering is applied. This paper proposes a new method for making the algorithm more effective by considering the initial selection of med...
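The pipeline described in the abstract (model each object as a distribution, measure similarity with KL divergence, then cluster around medoids) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the discrete five-star rating distributions, the symmetrized form of the divergence, the simple alternating (Voronoi-style) k-medoids loop, and the names `kl_divergence`, `symmetric_kl`, `mean_rating`, and `k_medoids`. The initial medoids are passed in explicitly, since the abstract points to their selection as the part being improved.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions on the same support."""
    # eps only guards against division by zero when q has an empty bin.
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def symmetric_kl(p, q):
    """Symmetrized divergence (an illustrative choice, not the paper's)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

def mean_rating(p):
    """Expected star rating for a distribution over 1..len(p) stars."""
    return sum((star + 1) * pi for star, pi in enumerate(p))

def k_medoids(objects, init_medoids, dist, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = list(init_medoids)
    k = len(medoids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each object joins its nearest medoid.
        labels = [min(range(k), key=lambda c: dist(obj, objects[medoids[c]]))
                  for obj in objects]
        # Update step: the new medoid minimizes total distance within its cluster.
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster emptied out
                new_medoids.append(medoids[c])
                continue
            new_medoids.append(min(
                members,
                key=lambda m: sum(dist(objects[i], objects[m]) for i in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, labels

# Hypothetical rating distributions over 1..5 stars. All four share the
# same mean, but two are concentrated and two are polarized.
ratings = [
    [0.05, 0.10, 0.70, 0.10, 0.05],  # concentrated
    [0.04, 0.12, 0.68, 0.12, 0.04],  # concentrated
    [0.40, 0.05, 0.10, 0.05, 0.40],  # polarized
    [0.38, 0.07, 0.10, 0.07, 0.38],  # polarized
]

medoids, labels = k_medoids(ratings, init_medoids=[0, 2], dist=symmetric_kl)
print("means:", [round(mean_rating(p), 3) for p in ratings])
print("medoids:", medoids, "labels:", labels)
```

Because all four objects have the same mean rating, a distance computed on the means alone cannot separate them, whereas the divergence between the full distributions can. The sketch does not attempt to reproduce the paper's medoid initialization strategy.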
Similar resources
Clustering Multi-Attribute Uncertain Data using Probability Distribution
Clustering is an unsupervised classification technique for grouping a set of abstract objects into classes of similar objects. Clustering uncertain data is one of the essential tasks in mining uncertain data. Uncertain data is typically found in areas such as sensor networks, weather data, and customer rating data. The earlier methods for clustering uncertain data based on probability distribution,...
A Review of Clustering Algorithms for Clustering Uncertain Data
Clustering is an important task in data mining. Clustering uncertain data is challenging both in modeling the similarity between uncertain objects and in developing efficient computational methods. Most previous methods extend partitioning clustering methods and density-based clustering methods, which are based on the geometric distance between two objects. Such methods cannot ...
Clustering on Uncertain Data using Kullback Leibler Divergence Measurement based on Probability Distribution
Cluster analysis is one of the important data analysis methods and is a very complex task. It is the art of detecting groups of similar objects in large data sets without requiring the groups to be specified by means of explicit features or knowledge of the data. Clustering uncertain data is a most difficult task both in modeling the similarity between uncertain data objects and in developing efficient computat...
An Efficient Divergence and Distribution Based Similarity Measure for Clustering Of Uncertain Data
Data mining is the extraction of hidden predictive information from large databases. Clustering is one of the popular data mining techniques. Clustering uncertain data, one of the essential tasks in mining uncertain data, poses significant challenges for both modeling similarity between uncertain objects and developing efficient computational methods. The previous methods extend traditional p...
A Novel And Improved Technique For Clustering Uncertain Data
Clustering uncertain data is one of the essential tasks in data mining. Traditional algorithms for clustering uncertain data, such as K-Means, UK-Means, and density-based clustering, are limited to geometric distance-based similarity measures and cannot capture differences between the distributions of uncertain objects. Such methods cannot handle uncertain objects t...
Publication date: 2015